Overview

Dataset statistics

Number of variables: 18
Number of observations: 92711
Missing cells: 5
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.4 MiB
Average record size in memory: 152.0 B
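
The memory and missing-cell figures above are consistent with each other. A minimal sketch of the arithmetic, assuming the total size is simply observations times average record size and that the missing share is taken over all cells (rows times variables):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 18
n_observations = 92711
missing_cells = 5
avg_record_bytes = 152.0

# Total size, assuming total = observations * average record size.
total_mib = n_observations * avg_record_bytes / 2**20  # 1 MiB = 1,048,576 bytes

# Share of missing cells, assuming the denominator is rows * variables.
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"{total_mib:.1f} MiB, {missing_pct:.4f}% missing")
```

The result rounds to the 13.4 MiB shown in the table, and the missing share is far below the 0.1% display threshold.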

Variable types

Numeric: 8
Categorical: 10

Dataset

Description: Dashboard of dataset clientes enero
Creator: Jose Angel Carballo Sanchez
Author: Miguel Moreno
URL:

Variable descriptions

edad: Customer age.
facturacion: Amount the customer pays per month.
antiguedad: Customer sign-up date.
provincia: Customer's province.
num_lineas: Number of mobile lines contracted.
num_lineas_impago: Number of lines with unpaid bills.
incidencia: SI = the customer has had an incident or complaint.
conexion: Customer's type of internet connection.
vel_conexion: Internet connection speed.
TV: Type of TV package contracted by the customer.
num_llamad_ent: Number of incoming calls across all of the customer's lines.
num_llamad_sal: Number of outgoing calls across all of the customer's lines.
mb_datos: MB of data consumed across all of the customer's lines.
seg_llamad_ent: Seconds used in incoming calls.
seg_llamad_sal: Seconds used in outgoing calls.
financiacion: SI = the customer has a financed handset.
imp_financ: Monthly amount paid for financed handsets.
descuentos: SI = the customer has an active discount.

Alerts

antiguedad has a high cardinality: 92237 distinct values (High cardinality)
conexion is highly correlated with vel_conexion (High correlation)
vel_conexion is highly correlated with conexion (High correlation)
antiguedad is uniformly distributed (Uniform)
facturacion has unique values (Unique)
num_llamad_sal has 936 (1.0%) zeros (Zeros)
imp_financ has 86045 (92.8%) zeros (Zeros)
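
The percentages quoted in the alerts can be reproduced from the raw counts and the number of observations. A small sketch, assuming each percentage is the count divided by 92711 rows and rounded to one decimal (the dictionary keys are illustrative labels, not names used by the report):

```python
n_observations = 92711

# Counts copied from the alerts and the per-variable summaries.
alert_counts = {
    "antiguedad distinct": 92237,
    "num_llamad_sal zeros": 936,
    "imp_financ zeros": 86045,
}

# Percentage of rows, rounded to one decimal as displayed in the report.
alert_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in alert_counts.items()
}

for name, pct in alert_pct.items():
    print(f"{name}: {pct}%")
```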

Reproduction

Analysis started: 2022-05-04 15:40:28.458511
Analysis finished: 2022-05-04 15:41:04.254840
Duration: 35.8 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
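
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the format string is an assumption matching how the timestamps are printed here):

```python
from datetime import datetime

# Format assumed from the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-05-04 15:40:28.458511", FMT)
finished = datetime.strptime("2022-05-04 15:41:04.254840", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```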

Variables

edad
Real number (ℝ≥0)

Customer age.

Distinct: 68
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.42923709
Minimum: 18
Maximum: 85
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 18
5-th percentile: 21
Q1: 34
median: 51
Q3: 68
95-th percentile: 82
Maximum: 85
Range: 67
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 19.58591326
Coefficient of variation (CV): 0.380832273
Kurtosis: -1.198694328
Mean: 51.42923709
Median Absolute Deviation (MAD): 17
Skewness: 0.001051985051
Sum: 4768056
Variance: 383.6079982
Monotonicity: Not monotonic
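
Several of the statistics for edad are derived from others in the same tables. The sketch below recomputes them from the reported figures, assuming the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / count):

```python
# Reported values for edad, copied from the tables.
minimum, maximum = 18, 85
q1, q3 = 34, 68
mean = 51.42923709
std = 19.58591326
total = 4768056
n_observations = 92711

value_range = maximum - minimum          # reported Range
iqr = q3 - q1                            # reported Interquartile range
cv = std / mean                          # reported Coefficient of variation
variance = std ** 2                      # reported Variance
mean_from_sum = total / n_observations   # reported Mean

print(value_range, iqr, round(cv, 6), round(variance, 4), round(mean_from_sum, 6))
```

Each recomputed value agrees with the report up to the rounding of the inputs.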
Histogram with fixed size bins (bins=68)
Value              Count   Frequency (%)
37                 1461    1.6%
60                 1431    1.5%
72                 1426    1.5%
53                 1418    1.5%
20                 1414    1.5%
47                 1404    1.5%
24                 1404    1.5%
50                 1404    1.5%
27                 1402    1.5%
32                 1402    1.5%
Other values (58)  78545   84.7%

Value  Count  Frequency (%)
18     1380   1.5%
19     1302   1.4%
20     1414   1.5%
21     1368   1.5%
22     1327   1.4%

Value  Count  Frequency (%)
85     1319   1.4%
84     1289   1.4%
83     1329   1.4%
82     1346   1.5%
81     1379   1.5%

facturacion
Real number (ℝ≥0)

UNIQUE

Amount the customer pays per month.

Distinct: 92711
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 207.4886998
Minimum: 15.00043941
Maximum: 399.9984328
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 15.00043941
5-th percentile: 34.47479163
Q1: 111.3683852
median: 207.0893664
Q3: 304.349361
95-th percentile: 380.6885379
Maximum: 399.9984328
Range: 384.9979934
Interquartile range (IQR): 192.9809758

Descriptive statistics

Standard deviation: 111.2394756
Coefficient of variation (CV): 0.5361230549
Kurtosis: -1.20486072
Mean: 207.4886998
Median Absolute Deviation (MAD): 96.48298181
Skewness: 0.004058335396
Sum: 19236484.85
Variance: 12374.22094
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=80)
Value                 Count   Frequency (%)
216.0281089           1       < 0.1%
361.0878322           1       < 0.1%
244.1644781           1       < 0.1%
178.4619084           1       < 0.1%
222.5037351           1       < 0.1%
328.7416895           1       < 0.1%
359.3658789           1       < 0.1%
260.9736906           1       < 0.1%
238.8083378           1       < 0.1%
133.0520124           1       < 0.1%
Other values (92701)  92701   > 99.9%

Value        Count  Frequency (%)
15.00043941  1      < 0.1%
15.00449739  1      < 0.1%
15.01707741  1      < 0.1%
15.02045972  1      < 0.1%
15.02213553  1      < 0.1%

Value        Count  Frequency (%)
399.9984328  1      < 0.1%
399.9974432  1      < 0.1%
399.9915826  1      < 0.1%
399.9852974  1      < 0.1%
399.9835731  1      < 0.1%

antiguedad
Categorical

HIGH CARDINALITY
UNIFORM

Customer sign-up date.

Distinct: 92237
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value                 Count
01/07/2020 03:55 PM   3
01/09/2020 02:33 PM   3
01/14/2020 05:08 PM   3
01/25/2020 12:51 PM   3
01/19/2020 04:57 PM   3
Other values (92232)  92696

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91769
Unique (%): 99.0%

Sample

1st row: 11/23/2018 08:48 AM
2nd row: 08/22/2017 03:19 AM
3rd row: 12/27/2001 01:50 PM
4th row: 08/08/2015 10:53 AM
5th row: 11/04/1997 11:43 AM
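
antiguedad is profiled as a categorical string even though it holds timestamps, which is why it triggers the high-cardinality alert. A minimal sketch of parsing the sample rows into datetimes, assuming a month/day/year order with a 12-hour clock (the samples include day values above 12 in the second field, which supports that reading) and an English AM/PM locale:

```python
from datetime import datetime

# Assumed format: MM/DD/YYYY hh:mm AM|PM, inferred from the sample rows.
FMT = "%m/%d/%Y %I:%M %p"

samples = [
    "11/23/2018 08:48 AM",
    "08/22/2017 03:19 AM",
    "12/27/2001 01:50 PM",
    "08/08/2015 10:53 AM",
    "11/04/1997 11:43 AM",
]

parsed = [datetime.strptime(value, FMT) for value in samples]

# Every sample is 19 characters long, matching the Length statistics.
lengths = {len(value) for value in samples}

print(min(parsed), max(parsed), lengths)
```

Once parsed, the column could be profiled as a date or converted to a tenure in days, which is usually more useful than treating each timestamp as its own category.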

Common Values

Value                 Count   Frequency (%)
01/07/2020 03:55 PM   3       < 0.1%
01/09/2020 02:33 PM   3       < 0.1%
01/14/2020 05:08 PM   3       < 0.1%
01/25/2020 12:51 PM   3       < 0.1%
01/19/2020 04:57 PM   3       < 0.1%
01/07/2020 10:37 PM   3       < 0.1%
01/15/2020 07:33 AM   2       < 0.1%
01/18/2008 12:49 AM   2       < 0.1%
01/27/2020 11:04 PM   2       < 0.1%
10/14/2003 04:41 AM   2       < 0.1%
Other values (92227)  92685   > 99.9%

Length

Histogram of lengths of the category

Value                Count    Frequency (%)
pm                   46394    16.7%
am                   46317    16.7%
01/05/2020           174      0.1%
12:37                165      0.1%
01:44                163      0.1%
01/03/2020           163      0.1%
03:54                161      0.1%
01/10/2020           161      0.1%
01:08                159      0.1%
02:00                159      0.1%
Other values (9874)  184117   66.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

provincia
Categorical

Customer's province.

Distinct: 50
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value              Count
Valencia           1941
Asturias           1934
Murcia             1931
Navarra            1930
Zaragoza           1927
Other values (45)  83048

Length

Max length: 22
Median length: 7
Mean length: 7.603596121
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: La Rioja
2nd row: Vizcaya
3rd row: Albacete
4th row: Lugo
5th row: Huelva

Common Values

Value              Count   Frequency (%)
Valencia           1941    2.1%
Asturias           1934    2.1%
Murcia             1931    2.1%
Navarra            1930    2.1%
Zaragoza           1927    2.1%
Málaga             1924    2.1%
Alicante           1895    2.0%
Orense             1891    2.0%
Guipúzcoa          1886    2.0%
Zamora             1879    2.0%
Other values (40)  73573   79.4%

Length

Histogram of lengths of the category

Value              Count   Frequency (%)
la                 3683    3.4%
valencia           1941    1.8%
asturias           1934    1.8%
murcia             1931    1.8%
navarra            1930    1.8%
zaragoza           1927    1.8%
málaga             1924    1.8%
alicante           1895    1.8%
orense             1891    1.8%
guipúzcoa          1886    1.8%
Other values (47)  86560   80.5%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

num_lineas
Categorical

Number of mobile lines contracted.

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count
3      29071
4      25927
5      22161
2      12793
1      2759

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5
2nd row: 3
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value  Count   Frequency (%)
3      29071   31.4%
4      25927   28.0%
5      22161   23.9%
2      12793   13.8%
1      2759    3.0%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
3      29071   31.4%
4      25927   28.0%
5      22161   23.9%
2      12793   13.8%
1      2759    3.0%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

num_lineas_impago
Categorical

Number of lines with unpaid bills.

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count
0.0    90097
4.0    685
3.0    652
2.0    639
1.0    638

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value  Count   Frequency (%)
0.0    90097   97.2%
4.0    685     0.7%
3.0    652     0.7%
2.0    639     0.7%
1.0    638     0.7%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
0.0    90097   97.2%
4.0    685     0.7%
3.0    652     0.7%
2.0    639     0.7%
1.0    638     0.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

incidencia
Categorical

SI = the customer has had an incident or complaint.

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count
NO     90720
SI     1991

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value  Count   Frequency (%)
NO     90720   97.9%
SI     1991    2.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
no     90720   97.9%
si     1991    2.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

num_llamad_ent
Real number (ℝ≥0)

Number of incoming calls across all of the customer's lines.

Distinct: 251
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 125.1098359
Minimum: 0
Maximum: 250
Zeros: 389
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 12
Q1: 62
median: 125
Q3: 188
95-th percentile: 238
Maximum: 250
Range: 250
Interquartile range (IQR): 126

Descriptive statistics

Standard deviation: 72.42107473
Coefficient of variation (CV): 0.5788599608
Kurtosis: -1.196924576
Mean: 125.1098359
Median Absolute Deviation (MAD): 63
Skewness: 0.003331937553
Sum: 11599058
Variance: 5244.812065
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=80)
Value               Count   Frequency (%)
188                 443     0.5%
243                 420     0.5%
158                 416     0.4%
226                 409     0.4%
62                  407     0.4%
144                 407     0.4%
139                 406     0.4%
159                 405     0.4%
228                 405     0.4%
203                 405     0.4%
Other values (241)  88588   95.6%

Value  Count  Frequency (%)
0      389    0.4%
1      375    0.4%
2      323    0.3%
3      356    0.4%
4      349    0.4%

Value  Count  Frequency (%)
250    390    0.4%
249    402    0.4%
248    391    0.4%
247    373    0.4%
246    374    0.4%

num_llamad_sal
Real number (ℝ≥0)

ZEROS

Number of outgoing calls across all of the customer's lines.

Distinct: 101
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.85895956
Minimum: 0
Maximum: 100
Zeros: 936
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 5
Q1: 25
median: 50
Q3: 75
95-th percentile: 95
Maximum: 100
Range: 100
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 29.20854901
Coefficient of variation (CV): 0.5858234761
Kurtosis: -1.204273467
Mean: 49.85895956
Median Absolute Deviation (MAD): 25
Skewness: 0.008586540307
Sum: 4622474
Variance: 853.1393351
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=80)
Value              Count   Frequency (%)
100                998     1.1%
25                 980     1.1%
37                 971     1.0%
14                 970     1.0%
38                 970     1.0%
72                 968     1.0%
35                 968     1.0%
89                 967     1.0%
5                  964     1.0%
24                 962     1.0%
Other values (91)  82993   89.5%

Value  Count  Frequency (%)
0      936    1.0%
1      913    1.0%
2      907    1.0%
3      896    1.0%
4      951    1.0%

Value  Count  Frequency (%)
100    998    1.1%
99     895    1.0%
98     909    1.0%
97     915    1.0%
96     906    1.0%

mb_datos
Real number (ℝ≥0)

MB of data consumed across all of the customer's lines.

Distinct: 24393
Distinct (%): 26.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12510.1905
Minimum: 0
Maximum: 25000
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1250.5
Q1: 6232.5
median: 12526
Q3: 18742
95-th percentile: 23748
Maximum: 25000
Range: 25000
Interquartile range (IQR): 12509.5

Descriptive statistics

Standard deviation: 7217.671483
Coefficient of variation (CV): 0.5769433716
Kurtosis: -1.200950925
Mean: 12510.1905
Median Absolute Deviation (MAD): 6260
Skewness: -0.003898539754
Sum: 1159832271
Variance: 52094781.64
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=80)
Value                 Count   Frequency (%)
13465                 12      < 0.1%
22543                 12      < 0.1%
18137                 12      < 0.1%
23818                 12      < 0.1%
10063                 12      < 0.1%
6257                  12      < 0.1%
12180                 11      < 0.1%
4350                  11      < 0.1%
8048                  11      < 0.1%
19894                 11      < 0.1%
Other values (24383)  92595   99.9%

Value  Count  Frequency (%)
0      3      < 0.1%
1      6      < 0.1%
2      4      < 0.1%
3      4      < 0.1%
4      2      < 0.1%

Value  Count  Frequency (%)
25000  6      < 0.1%
24999  4      < 0.1%
24998  1      < 0.1%
24997  3      < 0.1%
24996  6      < 0.1%

seg_llamad_ent
Real number (ℝ≥0)

Seconds used in incoming calls.

Distinct: 19815
Distinct (%): 21.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9985.382781
Minimum: 0
Maximum: 20000
Zeros: 8
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1014
Q1: 4960
median: 9998
Q3: 14981
95-th percentile: 18998
Maximum: 20000
Range: 20000
Interquartile range (IQR): 10021

Descriptive statistics

Standard deviation: 5774.903324
Coefficient of variation (CV): 0.5783356983
Kurtosis: -1.200962746
Mean: 9985.382781
Median Absolute Deviation (MAD): 5010
Skewness: 0.004930637947
Sum: 925754823
Variance: 33349508.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=80)
Value                 Count   Frequency (%)
1863                  16      < 0.1%
2676                  16      < 0.1%
1883                  14      < 0.1%
14137                 14      < 0.1%
3979                  14      < 0.1%
17210                 14      < 0.1%
15036                 13      < 0.1%
1557                  13      < 0.1%
18311                 13      < 0.1%
8851                  13      < 0.1%
Other values (19805)  92571   99.8%

Value  Count  Frequency (%)
0      8      < 0.1%
1      5      < 0.1%
2      2      < 0.1%
3      4      < 0.1%
4      3      < 0.1%

Value  Count  Frequency (%)
20000  5      < 0.1%
19999  4      < 0.1%
19998  4      < 0.1%
19997  7      < 0.1%
19996  7      < 0.1%

seg_llamad_sal
Real number (ℝ≥0)

Seconds used in outgoing calls.

Distinct: 19798
Distinct (%): 21.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10030.44396
Minimum: 0
Maximum: 20000
Zeros: 6
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1002
Q1: 5010
median: 10037
Q3: 15036
95-th percentile: 19011
Maximum: 20000
Range: 20000
Interquartile range (IQR): 10026

Descriptive statistics

Standard deviation: 5786.754197
Coefficient of variation (CV): 0.5769190496
Kurtosis: -1.203101787
Mean: 10030.44396
Median Absolute Deviation (MAD): 5016
Skewness: -0.00466271821
Sum: 929932490
Variance: 33486524.14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=80)
Value                 Count   Frequency (%)
19562                 15      < 0.1%
8315                  15      < 0.1%
75                    14      < 0.1%
14760                 14      < 0.1%
13760                 13      < 0.1%
4828                  13      < 0.1%
9090                  13      < 0.1%
16371                 13      < 0.1%
13878                 13      < 0.1%
16149                 13      < 0.1%
Other values (19788)  92575   99.9%

Value  Count  Frequency (%)
0      6      < 0.1%
1      5      < 0.1%
2      4      < 0.1%
3      4      < 0.1%
4      1      < 0.1%

Value  Count  Frequency (%)
20000  4      < 0.1%
19999  7      < 0.1%
19998  6      < 0.1%
19997  3      < 0.1%
19996  8      < 0.1%

conexion
Categorical

HIGH CORRELATION

Customer's type of internet connection.

Distinct: 2
Distinct (%): < 0.1%
Missing: 2
Missing (%): < 0.1%
Memory size: 1.4 MiB

Value  Count
ADSL   46590
FIBRA  46119

Length

Max length: 5
Median length: 4
Mean length: 4.497459794
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FIBRA
2nd row: FIBRA
3rd row: ADSL
4th row: FIBRA
5th row: FIBRA

Common Values

Value      Count   Frequency (%)
ADSL       46590   50.3%
FIBRA      46119   49.7%
(Missing)  2       < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
adsl   46590   50.3%
fibra  46119   49.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

vel_conexion
Categorical

HIGH CORRELATION

Internet connection speed.

Distinct: 11
Distinct (%): < 0.1%
Missing: 3
Missing (%): < 0.1%
Memory size: 1.4 MiB

Value             Count
200MB             9342
600MB             9299
300MB             9212
50MB              9167
100MB             9099
Other values (6)  46589

Length

Max length: 5
Median length: 4
Mean length: 4.398584804
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 50MB
2nd row: 600MB
3rd row: 35MB
4th row: 200MB
5th row: 200MB

Common Values

Value  Count  Frequency (%)
200MB  9342   10.1%
600MB  9299   10.0%
300MB  9212   9.9%
50MB   9167   9.9%
100MB  9099   9.8%
20MB   7882   8.5%
25MB   7840   8.5%
10MB   7807   8.4%
30MB   7761   8.4%
35MB   7672   8.3%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
200mb  9342   10.1%
600mb  9299   10.0%
300mb  9212   9.9%
50mb   9167   9.9%
100mb  9099   9.8%
20mb   7882   8.5%
25mb   7840   8.5%
10mb   7807   8.4%
30mb   7761   8.4%
35mb   7672   8.3%
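
The high-correlation alert between conexion and vel_conexion can be made plausible from the counts alone. The sketch below parses the speed labels into integers and sums the counts of the faster tiers. The 50 MB cut-off between ADSL and fibre tiers is an assumption for illustration, not something the report states; the counts are copied from the tables above.

```python
import re

# Counts for the ten speed tiers listed in the Common Values table.
speed_counts = {
    "200MB": 9342, "600MB": 9299, "300MB": 9212, "50MB": 9167, "100MB": 9099,
    "20MB": 7882, "25MB": 7840, "10MB": 7807, "30MB": 7761, "35MB": 7672,
}

def speed_value(label: str) -> int:
    """Turn a label such as '200MB' into the integer 200."""
    match = re.fullmatch(r"(\d+)MB", label)
    if match is None:
        raise ValueError(f"unexpected speed label: {label!r}")
    return int(match.group(1))

# Hypothetical threshold: tiers at or above 50 MB are treated as fibre.
FIBRE_THRESHOLD_MB = 50

fast_total = sum(
    count for label, count in speed_counts.items()
    if speed_value(label) >= FIBRE_THRESHOLD_MB
)
slow_total = sum(speed_counts.values()) - fast_total

print(fast_total, slow_total)
```

Under that assumption the fast tiers sum to 46119, which equals the FIBRA count reported for conexion, so one variable largely determines the other.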

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

TV
Categorical

Type of TV package contracted by the customer.

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value        Count
tv-futbol    46191
tv-familiar  32822
tv-total     13698

Length

Max length: 11
Median length: 9
Mean length: 9.560300288
Min length: 8

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: tv-futbol
2nd row: tv-futbol
3rd row: tv-futbol
4th row: tv-familiar
5th row: tv-futbol

Common Values

Value        Count   Frequency (%)
tv-futbol    46191   49.8%
tv-familiar  32822   35.4%
tv-total     13698   14.8%

Length

Histogram of lengths of the category

Pie chart

Value        Count   Frequency (%)
tv-futbol    46191   49.8%
tv-familiar  32822   35.4%
tv-total     13698   14.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

financiacion
Categorical

SI = the customer has a financed handset.

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count
NO     86045
SI     6666

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: NO
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value  Count   Frequency (%)
NO     86045   92.8%
SI     6666    7.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
no     86045   92.8%
si     6666    7.2%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

imp_financ
Real number (ℝ≥0)

ZEROS

Monthly amount paid for financed handsets.

Distinct: 6667
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.601432833
Minimum: 0
Maximum: 39.99195402
Zeros: 86045
Zeros (%): 92.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 15.3968231
Maximum: 39.99195402
Range: 39.99195402
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.366160907
Coefficient of variation (CV): 3.975290612
Kurtosis: 17.59324097
Mean: 1.601432833
Median Absolute Deviation (MAD): 0
Skewness: 4.244524075
Sum: 148470.4394
Variance: 40.5280047
Monotonicity: Not monotonic
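
Because 92.8% of imp_financ values are zero, the overall mean says little about what financing customers actually pay. A short sketch of the conditional mean, assuming every non-zero row corresponds to a customer with financing (the 86045 zeros match the NO count of financiacion):

```python
# Figures copied from the imp_financ tables.
total_paid = 148470.4394   # Sum
n_observations = 92711
n_zeros = 86045            # Zeros

n_financed = n_observations - n_zeros

mean_all = total_paid / n_observations   # mean over every customer
mean_financed = total_paid / n_financed  # mean over non-zero rows only

print(n_financed, round(mean_all, 4), round(mean_financed, 4))
```

The overall mean reproduces the reported 1.60, while the mean among customers with a non-zero amount is roughly 22.27 per month.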
Histogram with fixed size bins (bins=80)
Value                Count   Frequency (%)
0                    86045   92.8%
10.24354491          1       < 0.1%
18.59953244          1       < 0.1%
26.95092346          1       < 0.1%
16.98623804          1       < 0.1%
33.63331701          1       < 0.1%
5.816686074          1       < 0.1%
15.95684633          1       < 0.1%
27.30485137          1       < 0.1%
13.80874237          1       < 0.1%
Other values (6657)  6657    7.2%

Value        Count   Frequency (%)
0            86045   92.8%
5.009998664  1       < 0.1%
5.013309309  1       < 0.1%
5.021417588  1       < 0.1%
5.025074875  1       < 0.1%

Value        Count  Frequency (%)
39.99195402  1      < 0.1%
39.99012758  1      < 0.1%
39.98897814  1      < 0.1%
39.98756476  1      < 0.1%
39.97837601  1      < 0.1%

descuentos
Categorical

SI = the customer has an active discount.

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value  Count
NO     72673
SI     20038

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: SI
3rd row: SI
4th row: NO
5th row: NO

Common Values

Value  Count   Frequency (%)
NO     72673   78.4%
SI     20038   21.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count   Frequency (%)
no     72673   78.4%
si     20038   21.6%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r, only its direction does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
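
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs; the function name `pearson_r` and the toy data are made up for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Toy data: two exact linear relationships with different slopes and signs.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2 * v + 10 for v in x]      # increasing line
y_down = [-0.5 * v + 3 for v in x]  # decreasing line
r_up = pearson_r(x, y_up)
r_down = pearson_r(x, y_down)
```

Both lines are exact, so `r_up` is +1 and `r_down` is -1 up to floating-point rounding, regardless of how steep each line is.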

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
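
As a rough sketch of that recipe, the snippet below ranks each variable (tied values share the mean of their positions, which is an assumption of this example) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names `average_ranks`, `correlation` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def correlation(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the same correlation, computed on the rank variables."""
    return correlation(average_ranks(x), average_ranks(y))


# Toy data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r_raw = correlation(x, y)
```

Here `rho` reaches 1 because the ordering of `y` matches the ordering of `x` exactly, while `r_raw`, computed on the raw values, stays below 1.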

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
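
The pair counting can be written out directly. The sketch below follows the description literally; the function name `kendall_tau_a` and the toy data are made up for the example.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 is a tie in x or y; it counts towards neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Toy data: one swapped pair (3, 2) among otherwise ordered values.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs
```

This literal form is often called τ-a. Library implementations commonly apply a tie correction (τ-b), so their values can differ from this sketch when the data contain ties.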

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
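
For illustration, the sketch below computes a bias-corrected Cramér's V from two lists of labels, assuming the correction as it is usually stated for Bergsma (2013): the squared phi coefficient and both table dimensions are each shrunk by a term in 1/(n - 1). It is not taken from the report's own code, and the function name `cramers_v_corrected` and the toy labels are hypothetical.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long lists of labels."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    joint = Counter(zip(a, b))   # observed cell counts
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_n in row_totals.items():
        for col, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Assumed correction terms (Bergsma 2013).
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # A constant variable carries no association information;
        # returning 0.0 here is a choice made for this sketch.
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Made-up labels: the second variable is fully determined by the first.
conn = ["FIBRA", "ADSL"] * 50
speed = ["fast", "slow"] * 50
v_perfect = cramers_v_corrected(conn, speed)

# Made-up labels: every combination occurs equally often (independence).
left = ["FIBRA", "FIBRA", "ADSL", "ADSL"] * 25
right = ["SI", "NO", "SI", "NO"] * 25
v_independent = cramers_v_corrected(left, right)
```

With these inputs the perfectly associated pair gives a value of 1 (up to rounding) and the independent pair gives 0, matching the two ends of the range described above.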

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
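
One common way to obtain a nullity correlation like the one shown in the heatmap is to correlate the missing/not-missing indicators of two columns. The sketch below assumes that definition and uses `None` to mark a missing cell; the function name `nullity_correlation` and the toy columns are hypothetical.

```python
import math


def nullity_correlation(col_a, col_b):
    """Correlation between the missingness indicators of two columns.

    For two binary indicators the Pearson correlation reduces to the phi
    coefficient of the 2x2 table of (missing, present) counts.
    """
    both = only_a = only_b = neither = 0
    for a, b in zip(col_a, col_b):
        a_missing, b_missing = a is None, b is None
        if a_missing and b_missing:
            both += 1
        elif a_missing:
            only_a += 1
        elif b_missing:
            only_b += 1
        else:
            neither += 1
    denom = math.sqrt(
        (both + only_a) * (only_b + neither)    # a missing / a present
        * (both + only_b) * (only_a + neither)  # b missing / b present
    )
    if denom == 0:
        # A column that is fully present or fully missing has no variation.
        return float("nan")
    return (both * neither - only_a * only_b) / denom


# Toy columns: a and b are missing in the same rows, c in different rows.
a = [1, None, 3, None, 5, 6]
b = ["x", None, "y", None, "z", "w"]
c = [None, 2, None, 4, 5, 6]
same_pattern = nullity_correlation(a, b)      # 1.0
opposite_pattern = nullity_correlation(a, c)  # -0.5
```

A value near 1 means the two columns tend to be missing together, a negative value means one tends to be present where the other is missing, and columns without any missing cells have no defined value.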

Sample

First rows

index | edad | facturacion | antiguedad | provincia | num_lineas | num_lineas_impago | incidencia | num_llamad_ent/num_llamad_sal/mb_datos/seg_llamad_ent/seg_llamad_sal (unsplit) | conexion | vel_conexion | TV | financiacion | imp_financ | descuentos
0 | 63 | 216.028109 | 11/23/2018 08:48 AM | La Rioja | 5 | 0.0 | NO | 95196525763418520 | FIBRA | 50MB | tv-futbol | NO | 0.000000 | NO
1 | 84 | 255.830842 | 08/22/2017 03:19 AM | Vizcaya | 3 | 0.0 | NO | 443614471145418016 | FIBRA | 600MB | tv-futbol | NO | 0.000000 | SI
2 | 66 | 135.768153 | 12/27/2001 01:50 PM | Albacete | 4 | 0.0 | NO | 9427142852487106 | ADSL | 35MB | tv-futbol | NO | 0.000000 | SI
3 | 69 | 255.658527 | 08/08/2015 10:53 AM | Lugo | 4 | 0.0 | NO | 186202008373725052 | FIBRA | 200MB | tv-familiar | NO | 0.000000 | NO
4 | 51 | 99.348645 | 11/04/1997 11:43 AM | Huelva | 4 | 0.0 | NO | 37321907850098686 | FIBRA | 200MB | tv-futbol | NO | 0.000000 | NO
5 | 55 | 88.062883 | 06/14/1996 01:44 AM | Lérida | 4 | 0.0 | NO | 78963032511811695 | ADSL | 25MB | tv-futbol | SI | 31.553269 | NO
6 | 21 | 73.076377 | 07/02/2004 12:35 PM | La Coruña | 4 | 0.0 | NO | 183916442777113478 | ADSL | 30MB | tv-futbol | NO | 0.000000 | NO
7 | 30 | 395.481514 | 03/26/2018 10:22 PM | Alicante | 3 | 0.0 | NO | 15216171841049311638 | ADSL | 35MB | tv-total | NO | 0.000000 | NO
8 | 64 | 391.692196 | 09/15/2004 01:49 AM | Córdoba | 5 | 0.0 | NO | 9743109611028813798 | ADSL | 10MB | tv-futbol | NO | 0.000000 | SI
9 | 80 | 199.380443 | 07/26/2011 01:33 AM | Las Palmas | 2 | 0.0 | NO | 1874114428983714834 | FIBRA | 100MB | tv-total | NO | 0.000000 | SI

Last rows

index | edad | facturacion | antiguedad | provincia | num_lineas | num_lineas_impago | incidencia | num_llamad_ent/num_llamad_sal/mb_datos/seg_llamad_ent/seg_llamad_sal (unsplit) | conexion | vel_conexion | TV | financiacion | imp_financ | descuentos
92701 | 55 | 316.728460 | 05/01/2011 11:41 PM | Asturias | 5 | 0.0 | NO | 143344551185349 | ADSL | 10MB | tv-total | SI | 28.355596 | NO
92702 | 75 | 32.297445 | 12/02/2008 03:40 AM | Tarragona | 2 | 0.0 | NO | 961429627619241 | FIBRA | 200MB | tv-futbol | NO | 0.000000 | NO
92703 | 58 | 375.658420 | 06/09/2016 09:39 PM | Santa Cruz de Tenerife | 5 | 0.0 | NO | 141116740459611926 | FIBRA | 100MB | tv-total | NO | 0.000000 | NO
92704 | 32 | 15.570680 | 01/18/2013 12:54 PM | Tarragona | 2 | 0.0 | NO | 173583128183402873 | FIBRA | 200MB | tv-futbol | NO | 0.000000 | SI
92705 | 65 | 173.741667 | 03/05/2019 12:00 AM | Murcia | 5 | 0.0 | NO | 421739431008514566 | ADSL | 35MB | tv-familiar | SI | 23.138779 | NO
92706 | 36 | 215.890326 | 04/09/2013 01:33 PM | Guadalajara | 3 | 0.0 | NO | 21796905977358823 | ADSL | 30MB | tv-futbol | NO | 0.000000 | NO
92707 | 68 | 285.890750 | 08/08/2003 11:57 PM | Asturias | 5 | 0.0 | NO | 16899930347983996 | FIBRA | 200MB | tv-futbol | SI | 14.616422 | NO
92708 | 20 | 383.167610 | 03/27/2013 08:07 PM | Álava | 4 | 0.0 | NO | 1887119018123716720 | ADSL | 20MB | tv-futbol | NO | 0.000000 | NO
92709 | 53 | 53.301395 | 01/18/2020 02:30 AM | Sevilla | 2 | 0.0 | NO | 13840202641055217637 | FIBRA | 50MB | tv-futbol | NO | 0.000000 | NO
92710 | 18 | 57.158927 | 10/22/2009 07:17 PM | Las Palmas | 4 | 0.0 | NO | 217652177214141927 | ADSL | 25MB | tv-familiar | NO | 0.000000 | SI